Add a new packaging.dependency_groups module, based on the existing dependency-groups package#1065
Add a new packaging.dependency_groups module, based on the existing dependency-groups package#1065
packaging.dependency_groups module, based on the existing dependency-groups package#1065Conversation
fa0636a to
b9cee56
Compare
I think it might make sense to start using dataclasses where it simplifies the code, but dataclasses don't play well with slots until Python 3.10, so at least for performance sensitive code it's best to hold off until Python 3.10 can be required. |
|
You can add slots to dataclasses before 3.10, you just can't auto-generate them. And when you auto-generate them, it actually makes a new class rather than adding them to the current one (which isn't allowed), so technically adding them manually might be a hair better. |
|
Is it possible to do exceptiongroups? We have exception groups already I believe1, and I'll be using them for #847. (Haven't checked the code yet, but just a quick thought). Footnotes
|
I take this to her a reference to the error collection pattern I used for duplicate names after normalization. My understanding of exception groups is that the target use cases are capturing and propagating multiple unrelated errors -- that's what the original PEP proposed it for. But maybe the practice has evolved beyond those use cases? I haven't used the feature much but am very willing to learn. Anyway, I'm happy to make any changes requested, but since I'm a little unclear on this one I'm leaving it as-is at the moment. |
|
In pyproject-metadata, it's used to collect and present multiple unrelated errors in the pyproject.toml. So if a user makes two mistakes, both can be shown, rather than having to fix one at a time then see what else breaks. They do need to be unrelated, otherwise you can't produce both errors. For example: [dependency-groups]
test = ["one"]
Test = ["two"]
dev = ["one"]
Dev = ["two"]
start = { include-group = "end" }
end = { include-group = "start" }
something = {}I could imaging that throwing a group with exceptions: ExceptionGroup([
DuplicateGroupNames("test"),
DuplicateGroupNames("dev"),
CyclicDependencyGroup("start -> end -> start"),
InvalidDependencyGroupObject("something"),
])If it's something you are interested in, I can separate out the Maybe errors in dependency-groups are too rare for this to matter? But it's good to get the API in if you do want it, as users have to be aware of the exception types. In pyproject-metadata, you have to opt into groups because of the API change, it can't easily be added after an API is out. |
|
Oh, I see. If we allow all of the categories of errors which may be raised to be grouped, we will want an exception group, yes. I don't know if it's practical to do that for these errors? For example, consider this data: [dependency-groups]
# a cycle
start = [{ include-group = "end" }]
end = [{ include-group = "start" }]
# "fix" the cycle with a duplicate name
START = []The current implementation emits If we wanted to proceed and find the cycle as well, we'd have to do branched evaluation both against But in general, I like collecting up multiple errors and reporting them back when doing so doesn't impose significant extra burdens. I could easily see raising, for your original example, ExceptionGroup([
DuplicateGroupNames("test (test, Test)"),
DuplicateGroupNames("dev (dev, Dev)"),
])Is that worth doing on its own? We could also maybe group [dependency-groups]
# neither element is valid
foo = [{}, {include-groups = ["bar", "baz"]}]I'm generally inclined towards the simpler fail-fast behavior, but don't want to tie our hands WRT future enhancements. |
|
Reporting as many errors as possible saves human debugging time, so it's worth a little processing time; I'm not sure I know of a situation where parsing these would really need abort-early performance over fully reporting all the errors. I can try this to see how complex it looks, it's generally manageable with some structure. |
|
That's fair; I can't think of a performance-critical use case for dependency groups, especially on malformed data. I think the thing that troubles me is handling duplicate denormed names. Do we really want to expand the full tree of possible combinations? Here's the sort of data I have in mind: [dependency-groups]
start = [{include-group = "end"}]
end = [{include-group = "start"}]
START = []
END = []
contains = [{include-group = "start"}]There are 4 combinations, but the cycle only happens with The only thing I can think to do is to branch, and check for any errors from If it's acceptable to, in this case, raise an error that Cases with no duplicate names, like this example, are much easier to handle by "collecting all errors": [dependency-groups]
start1 = [{include-group = "end1"}]
end1 = [{include-group = "start1"}]
start2 = [{include-group = "end2"}]
end2 = [{include-group = "start2"}]
contains = [{include-group = "start1"}, {include-group = "start2"}] |
|
I'm not that worried about related errors; you could simply decide an order, and remove them. So if |
|
Ah, okay! That makes a lot of sense to me. Expanding the matrix of possibilities seems like more complexity than it's worth, but pruning out duplicates (or just picking one, by some deterministic rule) seems easy. |
|
Feel free to use |
|
Ahh, ExceptionGroup is from metadata.py, I knew it was in packaging somewhere already. I've opened #1071 with errors.py. |
The primary components of the `dependency-groups` package, which implements PEP 735, are added in a new module, `packaging.dependency_groups`. The CLI components of `dependency-groups` are ignored here. Some changes are made, both to adapt the implementation to fit into `packaging` and to cleanup issues spotted during this pass: - `dependency-groups` raises `ValueError` directly for several forms of bad data. `packaging` modules prefer their own subclasses of `ValueError`, and the error behaviors are therefore adjusted to match. - `dependency-groups` does runtime type checks on several function args with `isinstance`, raising `TypeError` if the wrong type is provided. As `packaging` modules never seem to do this otherwise, these checks are removed, in favor of relying on type annotations to document proper usage. - Some of the annotations in `dependency-groups` are subtly wrong, using types like `dict` where `Mapping` is more appropriate. Fixing `dict` and `list` to `Mapping` and `Sequence` generally covers this. Because `str` satisfies `Sequence[str]`, one instance of an explicit check for `str` data is added. - `DependencyGroupInclude` is converted from a dataclass to a plain class with slots. This seems to match other models in `packaging` (there are no other uses of `dataclasses`). It is also given a repr, to match the new doc page's doctest content. - The functional interface for `dependency-groups`, `resolve()`, is renamed to the more verbose `resolve_dependency_groups()`. This makes it less ambiguous when used in a from-import, which is common style for `packaging` usage. - `dependency-groups` documents the errors which may be raised by various methods, but this has been removed for two reasons. First, this was less laborious when plain `ValueError`s were raised, and it's now more to maintain. Second, it's a maintenance burden to ensure these are documented properly and there's no known error path which is missing. The testsuite from `dependency-groups` is ported over nearly verbatim. Some tests need updates to match the above changes, and all need minor updates to match style rules in `packaging`.
Apply the error collector type throughout the dependency group resolver to collect multiple errors at once for various cases. In most scenarios, this is a trivial wrapper over the existing exception logic. Because the dependency group resolver is lazy in most respects, it is usually capturing one error at a time. However, the main loop for expanding groups recursively is able to process all groups without stopping on the first error. Notably, resolution checks for errors and exits without caching results. This ensures that accessing a malformed group repeatedly on a single resolver instance raises multiple errors.
b9cee56 to
96a39e3
Compare
|
I've rebased and adapted to the ExceptionGroups/ErrorCollector paradigm from #1071 . Two interesting bits related to cycle detection:
EDIT: CI is big mad because |
In recent discussions within `packaging`, maintainers tried to determine why some object reprs had `<>` and some did not. No clear reason was found for using these.
group_contains requires that you inherit from the exception group backport on lower Pythons, which packaging does not do. To resolve, add a helper which looks pretty similar to the `group_contains()` API.
e303ddc to
174c798
Compare
henryiii
left a comment
There was a problem hiding this comment.
Looks good, a couple of minor things.
src/packaging/dependency_groups.py
Outdated
|
|
||
|
|
||
| def _normalize_name(name: str) -> str: | ||
| return re.sub(r"[-_.]+", "-", name).lower() |
There was a problem hiding this comment.
Not too worried about performance, but an easy thing would be to make this a compiled re.
src/packaging/dependency_groups.py
Outdated
| ) | ||
| ) | ||
| else: | ||
| include_group = next(iter(item.values())) |
There was a problem hiding this comment.
This is the same as include_group = item["include-group"]? Might that read a bit better?
You could also do:
values = item.values()
if len(values) != 1 or values[0] != "include-group":
...
else:
include_group = values[1]But I don't think that really reads better or worse.
There was a problem hiding this comment.
I really have no clue why I wrote it this way. It's been in the implementation from the beginning (all the way back to the reference version in the PEP) and it looks really weird to me now that I'm looking at it again. It's a bit of a head-scratcher -- best, I think, not to worry about how or why it came out that way at first.
I just updated it to item["include-group"] as part of a co-authored commit with your various suggestions. I think that's the best in terms of readability.
Co-authored-by: Henry Schreiner <4616906+henryiii@users.noreply.github.com>
This was initially raised in the PyPA discord.
The primary components of the
dependency-groupspackage, which implements PEP 735, are added in a new module,packaging.dependency_groups.The CLI components of
dependency-groupsare ignored here.Some changes are made, both to adapt the implementation to fit into
packagingand to cleanup issues spotted during this pass:dependency-groupsraisesValueErrordirectly for several forms ofbad data.
packagingmodules prefer their own subclasses ofValueError, and the error behaviors are therefore adjusted to match.dependency-groupsdoes runtime type checks on several function argswith
isinstance, raisingTypeErrorif the wrong type is provided.As
packagingmodules never seem to do this otherwise, these checksare removed, in favor of relying on type annotations to document proper
usage.
Some of the annotations in
dependency-groupsare subtly wrong, usingtypes like
dictwhereMappingis more appropriate. Fixingdictand
listtoMappingandSequencegenerally covers this. BecausestrsatisfiesSequence[str], one instance of an explicit check forstrdata is added.DependencyGroupIncludeis converted from a dataclass to a plain classwith slots. This seems to match other models in
packaging(there areno other uses of
dataclasses). It is also given a repr, to match thenew doc page's doctest content.
The functional interface for
dependency-groups,resolve(), isrenamed to the more verbose
resolve_dependency_groups(). This makesit less ambiguous when used in a from-import, which is common style for
packagingusage.dependency-groupsdocuments the errors which may be raised by variousmethods, but this has been removed for two reasons. First, this was
less laborious when plain
ValueErrors were raised, and it's now moreto maintain. Second, it's a maintenance burden to ensure these are
documented properly and there's no known error path which is missing.
The testsuite from
dependency-groupsis ported over nearly verbatim.Some tests need updates to match the above changes, and all need minor updates to match style rules in
packaging.